!pr2
18-Digit Arithmetic, Part 2................Bob Sander-Cederlof

Feedback on installment one of this series came from as far away as Sweden.  Paul Schlyter, with others, pointed out the omission of three very important letters.  PRINT (14.9*10) indeed prints 149, as expected.  What I meant to say was that PRINT INT(14.9*10) prints 148.

I noticed another error at the top of page 21.  The exponent range runs from 10^-63 thru 10^63, not 10^64.

Paul pointed out that my routines did not check for underflow and overflow.  I did have such checks in another part of the code, as yet unlisted, but I now agree with him that some checks belong in the routines printed last month.

The subroutine SHIFT.DAC.RIGHT.ONE is called when a carry beyond the most significant digit is detected in DADD, at line 1620.  If the exponent is already 10^63, or $7F, this shift right will cause overflow.  That means the sum formed by DADD is greater than 10^63, and we can do either of two things.  My usual choice, assuming the routines are being used from Applesoft, is to JMP directly to the Applesoft ROM overflow error routine, at $E8D5.  The other option is to set the DAC exponent to $7F, and the mantissa to all 9's.  To implement it my way, add these lines:

       1945       BMI .2

       2085 .2    JMP $E8D5
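
For readers who want to experiment away from the Apple, here is a minimal sketch in Python of the two overflow choices.  It is a model, not the DP18 code: it assumes the mantissa is held as a list of decimal digits and the exponent as a biased byte ($7F = 10^63), and the function name and the "clamp" switch are made up for the illustration.

```python
MAX_EXPONENT = 0x7F  # biased exponent byte for 10^63, per the article


def shift_dac_right_one(digits, exponent, clamp=False):
    """Model of the right shift that follows a carry out of the top digit.

    digits   -- decimal digits, most significant first (assumed layout)
    exponent -- biased exponent byte
    clamp    -- False: raise OverflowError (stands in for JMP $E8D5)
                True:  return exponent $7F with a mantissa of all 9's
    """
    if exponent >= MAX_EXPONENT:
        if clamp:
            return [9] * len(digits), MAX_EXPONENT
        raise OverflowError("sum is greater than 10^63")
    # The carry becomes a leading 1; in this model the last digit drops off.
    return [1] + digits[:-1], exponent + 1
```

Calling it with an exponent of $7F either raises the error or returns the all-9's value, depending on which behavior you prefer.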

Underflow needs to be tested in the NORMALIZE.DAC subroutine.  Underflow happens when the exponent falls below 10^-63.  The normal procedure upon underflow is to set the result to zero.  Zero values in DP18 are indicated by the exponent being zero, regardless of the mantissa value.  Delete lines 2400-2480 and line 2730, and enter the following lines:

       2400        LDY #-1
       2410 .1     INY
       2420        CPY #10
       2430        BCS .7
       2440        LDA DAC.HI,Y
       2450        BEQ .1

       2730 .6     LDA DAC.EXPONENT
       2731        BPL .8
       2732 .7     LDA #0
       2733        STA DAC.EXPONENT
       2734        STA DAC.SIGN
       2735 .8     RTS
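
The effect of the patch can be sketched in Python.  This is only a model under stated assumptions: the mantissa is a list of decimal digits rather than packed bytes, the lines of NORMALIZE.DAC that are not shown above are assumed to shift left and subtract the shift count from the exponent, and the function name is hypothetical.

```python
def normalize_dac(digits, exponent, sign):
    """Model of normalizing with the underflow check added.

    digits   -- decimal digits, most significant first (assumed layout)
    exponent -- biased exponent byte; zero means the value is zero
    sign     -- sign byte
    Returns (digits, exponent, sign).
    """
    if not any(digits):
        # Nothing but zeroes in the mantissa: the value is zero (branch to .7).
        return digits, 0, 0
    shift = next(i for i, d in enumerate(digits) if d != 0)
    digits = digits[shift:] + [0] * shift
    exponent = (exponent - shift) & 0xFF   # keep it a byte, as the 6502 would
    if exponent & 0x80:
        # The byte went "negative", which is what BPL tests: underflow,
        # so the result is set to zero.  The mantissa is left alone.
        return digits, 0, 0
    return digits, exponent, sign
```

For example, normalize_dac([0, 0, 0, 1], 0x02, 1) would need an exponent three below $02, so the model returns a zero exponent and a zero sign.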

All these changes will be installed on Quarterly Disk 15.

This month I want to present several pack and unpack subroutines, and one which rounds the value in DAC according to the value in the extension byte.

Note that I have just LISTed the subroutines below, rather than showing the assembly listing, because the program parts need to all be assembled together before they are meaningful.

There are two "unpack" subroutines, MOVE.YA.DAC and MOVE.YA.ARG.  They perform the "load accumulator" function.  There is one "pack" subroutine, MOVE.DAC.YA, which performs the "store accumulator" function.

The MOVE routines use a page-zero pair at $5E and $5F.  Assuming the DP18 package will be called from Applesoft via the &-vector, there will be no page-zero conflicts here.

The subroutines DADD and DSUB from last month, and DMULT and DDIV to come, all expect two arguments in DAC and ARG and leave the result in DAC.  Assuming there are two packed DP18 values at VAL.A and VAL.B, and that I want to add them together and store the result in VAL.C, I would do it this way:

       LDA #VAL.A
       LDY /VAL.A
       JSR MOVE.YA.DAC
       LDA #VAL.B
       LDY /VAL.B
       JSR MOVE.YA.ARG
       JSR DADD
       LDA #VAL.C
       LDY /VAL.C
       JSR MOVE.DAC.YA

Note that MOVE.DAC.YA calls ROUND.DAC before storing the result.  ROUND.DAC checks the extension byte.  If the extension byte has a value less than $50, no rounding need be done.  If it is $50 through $99, the value in DAC should be rounded up.  If the higher digits are less than .999999999999999999, then there will be no carry beyond the most significant digit, and no chance for overflow.  However, if it is all 9's we will get a final carry and we will need to change the number to 100000000000000000 and add one to the exponent.  In tiny precision, this is like rounding .995 up to 1.00.  If the exponent was already 10^63, rounding up with a final carry causes overflow, so I jump to the Applesoft error handler.
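
The rounding rule just described can be tried out with a short Python sketch.  It is a model only, not ROUND.DAC itself: the mantissa is assumed to be a list of decimal digits, the extension a BCD byte from $00 through $99, and the function name is hypothetical.

```python
MAX_EXPONENT = 0x7F  # biased exponent byte for 10^63


def round_dac(digits, extension, exponent):
    """Round the mantissa according to the extension byte.

    digits    -- decimal digits, most significant first (assumed layout)
    extension -- BCD byte, $00 through $99
    exponent  -- biased exponent byte
    Returns (digits, exponent).
    """
    if extension < 0x50:
        return digits, exponent          # less than half: nothing to do
    rounded = list(digits)
    for i in range(len(rounded) - 1, -1, -1):
        if rounded[i] < 9:
            rounded[i] += 1              # the carry stops here
            return rounded, exponent
        rounded[i] = 0                   # 9 plus carry gives 0, carry on
    # Every digit was 9: the value becomes 1000... and the exponent goes up one.
    if exponent >= MAX_EXPONENT:
        raise OverflowError("rounding carried past 10^63")
    return [1] + [0] * (len(digits) - 1), exponent + 1
```

With a two-digit mantissa, round_dac([9, 9], 0x50, 0x40) gives ([1, 0], 0x41), which is the ".995 rounds up to 1.00" case in miniature.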
!np    <<<<  MOVE listings here>>>>
None of the pack/unpack code is especially tricky, but the same cannot be said for DMULT.  Multiplication is handled "just like you do it with pencil and paper", but making it happen at all efficiently makes things look very tricky.

Load the multiplier and multiplicand into DAC and ARG (it doesn't matter which is which, because multiplication is commutative), then JSR DMULT to perform the multiply.  The result will be left in DAC.

Looking at the DMULT code, lines 1040-1070 handle the special cases of either argument being 0.  Anything times zero is zero, and zero values are indicated by the exponent being zero, so this is real easy.

Lines 1090-1130 clear a temporary register which is 20 bytes long.  This register will be used to accumulate the partial products.  Just in case some of the terminology is losing you, here are some definitions:

!lm+5
       12345  <-- multiplicand
     x 54321  <-- multiplier
   ---------
       12345  <-- 1st partial product
      24690   <-- 2nd partial product
     37035    <-- 3rd    "       "
    49380     <-- 4th    "       "
   61725      <-- 5th    "       "
   ---------
   670592745  <-- product
!lm-5

Lines 1150-1180 form the 20-byte product of the two 10-byte arguments.  I wanted to reduce the number of times the individual digits have to be isolated, or the accumulators shifted by 4 bits, so I used a trick.  Line 1150 calls a subroutine which multiplies the multiplicand (in ARG) by the low-order digit of each byte of the multiplier (in DAC).  In other words, I accumulate only the odd partial products at this time.  Then I shift DAC 4 bits right, which places the other set of digits in the low-order side of each byte.  I also have to shift the result register, MAC, right 4 bits, and then I call the MULTIPLY.BY.LOW.DIGITS subroutine again.
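
To see why two passes over the low-order digits are enough, here is a minimal Python model of the trick.  It is a sketch under assumptions, not the DMULT listing: the multiplier is a short list of BCD bytes, the multiplicand is an ordinary integer, and MAC is kept as an exact Fraction so that the model's right shift loses no digits (the real MAC is a fixed-size BCD register).  All three function names are hypothetical.

```python
from fractions import Fraction


def multiply_by_low_digits(mac, multiplier, multiplicand):
    """Add multiplicand times the low digit of each multiplier byte into mac.

    multiplier is a list of BCD bytes, most significant first, so the byte
    at index i has a weight of 100 ** (count - 1 - i).
    """
    count = len(multiplier)
    for i, byte in enumerate(multiplier):
        low_digit = byte & 0x0F
        mac += low_digit * multiplicand * 100 ** (count - 1 - i)
    return mac


def shift_right_4_bits(bcd_bytes):
    """Shift a list of BCD bytes right by one digit; the last digit falls off."""
    shifted = []
    carry_digit = 0
    for byte in bcd_bytes:
        shifted.append((carry_digit << 4) | (byte >> 4))
        carry_digit = byte & 0x0F
    return shifted


def multiply_mantissas(multiplier, multiplicand):
    """Two passes over the low digits with a one-digit right shift between."""
    mac = multiply_by_low_digits(Fraction(0), multiplier, multiplicand)
    multiplier = shift_right_4_bits(multiplier)   # other digits now on the low side
    mac /= 10                                     # shift MAC right one digit as well
    mac = multiply_by_low_digits(mac, multiplier, multiplicand)
    return mac                                    # the true product divided by ten
```

In this model multiply_mantissas([0x05, 0x43, 0x21], 12345) returns one tenth of 670592745, the product from the pencil-and-paper example.  The factor of ten is only a matter of where the decimal point sits after the shift.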

Lines 1200-1270 form the new exponent, which is the sum of the exponents of the two arguments.  Since both exponents have the value $40 added to make them appear positive, one of the $40's has to be subtracted back out.  But before that, if the sum is above $C0 then we have an overflow condition.  After subtracting out one of the $40's, if the result is negative we have an underflow condition.  Note that since the carry status was clear at line 1250, I subtracted $3F; for one more byte, I could have done it the normal way and used SEC, SBC #$40.
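
The exponent arithmetic, including the SBC trick, can be checked with a small Python sketch.  It is a model under assumptions rather than the code at lines 1200-1270: the overflow boundary is taken straight from the wording above and has not been checked against the listing, underflow is assumed to give a zero result as in NORMALIZE.DAC, and the function names are hypothetical.

```python
def sbc(a, operand, carry):
    """6502 SBC in binary mode, before wrapping to 8 bits.

    With carry set nothing extra is subtracted; with carry clear one more
    is taken away.
    """
    return a - operand - (1 - carry)


def product_exponent(exp_a, exp_b):
    """Combine two exponent bytes, each carrying the added $40."""
    total = exp_a + exp_b            # now carries $40 twice
    if total > 0xC0:                 # boundary as worded in the article
        raise OverflowError("product exponent too large")
    result = sbc(total, 0x3F, 0)     # carry clear, so this subtracts $40
    if result < 0:
        return 0                     # underflow: assume the result becomes zero
    return result
```

For instance, product_exponent(0x41, 0x42) gives $43 (10^1 times 10^2 is 10^3), and sbc(total, 0x3F, 0) always equals sbc(total, 0x40, 1), which is why the SEC could be left out.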

Lines 1290-1310 form the sign of the product, which is the exclusive-or of the signs of the two arguments.  Lines 1330-1370 copy the most significant 10 bytes of the product from MAC to DAC.

The result may have a leading zero digit in the left half of the first byte, so I call NORMALIZE.DAC at line 1390.  If the leading digit was zero, normalizing will shift DAC left one digit position, leaving room for another significant digit on the right end.  Lines 1400-1490 handle installing the extra digit if necessary.

MULTIPLY.BY.LOW.DIGITS picks up the low-order digit out of each byte of the multiplier, one-by-one, and calls MULTIPLY.ARG.BY.N.

MULTIPLY.ARG.BY.N does the nitty-gritty multiplication.  And here is where I lost all my ingenuity, too.  The multiplier digit is stored in DIGIT, and used to count down a loop which adds ARG to MAC DIGIT times.  Surely this can be done more efficiently!  How about it, Paul?  Or Charlie?  Anyone?
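
The repeated-addition loop is easy to model in Python.  This sketch is not the listing: it assumes ARG and MAC are BCD byte lists of equal length, most significant byte first, ignores the offset at which the real routine adds into MAC, and imitates decimal-mode ADC one byte at a time.  The function names are hypothetical.

```python
def add_bcd(mac, arg):
    """Add two equal-length BCD byte lists the way decimal-mode ADC would.

    Returns (sum_bytes, carry_out).
    """
    result = [0] * len(mac)
    carry = 0
    for i in range(len(mac) - 1, -1, -1):      # least significant byte first
        low = (mac[i] & 0x0F) + (arg[i] & 0x0F) + carry
        half_carry = 0
        if low > 9:
            low -= 10
            half_carry = 1
        high = (mac[i] >> 4) + (arg[i] >> 4) + half_carry
        carry = 0
        if high > 9:
            high -= 10
            carry = 1
        result[i] = (high << 4) | low
    return result, carry


def multiply_arg_by_n(mac, arg, digit):
    """Add arg into mac 'digit' times, as the count-down loop does."""
    for _ in range(digit):
        mac, _carry = add_bcd(mac, arg)        # mac is assumed wide enough
    return mac
```

Adding 012345 into a cleared three-byte MAC five times leaves 06 17 25, the fifth partial product from the earlier example.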

Well, that's all for this month.  Next month expect some simple I/O routines and the divide subroutine.
